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In this paper we assess the present status of dark matter direct searches by means of Bayesian 
statistics. We consider three particle physics models for spin-independent dark matter interaction 
with nuclei: elastic, inelastic and isospin violating scattering. We shortly present the state of the 
art for the three models, marginalising over experimental systematics and astrophysical uncertain- 
ties. Whatever the scenario is, XENONIOO appears to challenge the detection region of DAMA, 
CoGeNT and CRESST. The first aim of this study is to rigorously quantify the significance of the 
inconsistency between XENONIOO data and the combined set of detection (DAMA, CoGeNT and 
CRESST together), performing two statistical tests based on the Bayesian evidence. We show that 
XENONIOO and the combined set are inconsistent at least at 2a level in all scenarios but inelastic 
scattering, for which the disagreement drops to la level. Secondly we consider only the combined 
set and hunt the best particle physics model that accounts for the events, using Bayesian model 
comparison. The outcome between elastic and isospin violating scattering is inconclusive, with the 
odds 2:1, while inelastic scattering is disfavoured with the odds of 1 : 32 because of CoGeNT data. 
Our results are robust under reasonable prior assumptions. We conclude that the simple elastic 
scattering remains the best model to explain the detection regions, since the data do not support 
extra free parameters. The outcome of consistency tests implies that either a better understanding 
of astrophysical and experimental uncertainties is needed and the strength of belief in certain data 
sets should be revised, either the dark matter theoretical model is at odds with the data. 

PACS numbers: 95.35. -fd, 95.30.Cq 



I. INTRODUCTION 

The last years have seen an intense activity in direct 
searches for dark matter (DM) candidates, in particular 
weakly interacting massive particles (WIMPs). Three 
experiments support a hint of detection in the low DM 
mass regime: CoGeNT Ij, with an excess that follows a 
modulated behavior, CRESST-II [2] (CRESST from now 
on) with 67 events that can not be fully accounted for by 
known backgrounds, besides the thirteen-years signal at 
DAM A/LIBRA [3| (hereafter DAMA), which shows an 
annual modulation compatible with WIMP predictions. 
Alongside these 'signals', stands the series of null result 
experiments, most notably XENONIOO [T (XelOO hence- 
forth), which has the world strongest exclusion limit at 
present. The (in)compatibility between the low mass 
hints themselves and the several upper limits has been 
discussed in a variety of papers, see e.g. [5]-[8] for re- 
cent analyses in both model independent and specific 
DM scenarios. In this study our purpose is to use the 
tools of Bayesian statistics to investigate quantitatively 
the tension between experiments and to find which par- 
ticle physics model provides the best compromise for the 
low mass hints, motivated by the very recent data release 
of XelOO and the fact that an excess is likely still present 
in the new science run of CoGeNT |9j. 

Before heading towards the main intent of the paper, 
however, we wish to extend the Bayesian analysis pursued 
in [10' to the most recent experimental results and to dis- 
tinct particle physics interactions. We employ the same 
procedure as in [10] to include experimental systematics 
in the likelihood and to encompass astrophysical uncer- 



tainties using a motivated DM density profile with the re- 
lated velocity distribution. The inclusion of astrophysical 
uncertainties is becoming a common procedure, starting 
from [TT] for analysis of experimental results to [TH |T3] 
for reconstruction of WIMP parameters and forecasts. 
We consider, in addition to CoGeNT, DAMA and XelOO, 
the CRESST excess and KIMS [H] experiment. It is 
worth to analyse as well the exclusion bounds released 
by bubble chamber experiments, like PICASSO [15] and 
SIMPLE-II 16 . These experiments start to have a to- 
tal exposure sensitive to the cross-sections questioned by 
the low mass hints. Several scenarios of particle physics 
other than elastic spin-independent interaction have been 
proposed, trying to accommodate the exclusion bounds 
and DAMA, CoGeNT, CRESST excesses: e.g. inelastic 
DM [T7] , isospin violating scattering [THl [H] , long range 
forces [316] or composite DM Here, we consider the 
class of spin-independent interaction, namely elastic, in- 
elastic and isospin violating scattering. These are nested 
models: the more complicated models {e.g. with addi- 
tional degrees of freedom) can be reduced to the simplest 
one by fixing at a certain value the extra free parameters. 
We present inference for all the experiments listed above 
to establish the state of the art of current DM direct de- 
tection in each particle physics model considered, having 
marginalised over all nuisance parameters. This will be 
the ground for our Bayesian analysis, explained in the 
following. 

The outcome of parameter inference signals a disagree- 
ment between the detection regions and the exclusion 
bounds 'by eye': every experiment is evaluated sepa- 
rately and then all the contours are displayed together 
in a single plot, showing marginal or no overlap. Firstly 
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we feel that it would be interesting to further investigate 
this tension and to make use of statistical tools to quan- 
tify the degree of inconsistency between XelOO exclusion 
bound and DAMA, CoGeNT and CRESST together (to 
which we will refer hereafter as combined set), in the 
nested model framework described above. Our purpose 
is to re-consider the problem of the tension between all 
these experimental results with two statistical tests: the 
predictive likelihood ratio, or ^-test, and the ^-test, 
after [21] and introduced below. Both tests are based 
on the Bayesian evidence [53] , which is by definition 
the likelihood averaged over parameter space weighted by 
the prior probability of the parameters. These tests are 
therefore performed in data space. For a given set of data 
(XelOO -I- combined set), we allow ourself to change the 
outcome of a subset of it (XelOO data), keeping the rest 
fixed (combined set), to check wether a different observed 
value would improuve or diminish the agreement between 
the whole set. The result of each test will provide the sta- 
tistical significance of the (dis)agreement between XelOO 
data and the detection regions in every particle physics 
scenario. 



Since the tests will point out to an incompatibility be- 
tween XelOO and the detection regions, it does not make 
sense to combine all those experiments together. In the 
second part of this study, we then consider only the de- 
tection regions and apply Bayesian model comparison to 
select which one of the nested particle physics models ex- 
plains better the observations. Indeed, a scientific ques- 
tion that might be asked is about the probability of com- 
peting models under the data. This question can be as- 
sessed in the framework of Bayesian model comparison, 
by means of the Bayesian evidence, which automatically 
incorporates the notion of Occam's razor. Indeed mod- 
els that properly fit the data are rewarded through a 
favourable likelihood function, while models that are un- 
predictive are penalised by the larger parameter volume 
over which the likelihood must be averaged. The use of 
Bayesian model comparison is not so common in particle 
physics, however see e.g. [2T 1 [24H26] . 



The paper is organised as follows. In section [IT] we 
define the statistical framework for Bayesian inference, 
model comparison and consistency checks. The succeed- 
ing section [III| resumes in short the main feature of di- 
rect detection rates and defines the particle physics mod- 



els we wish to compare. In section IV we briefly define 
the likelihood for each experiment we consider and in- 
clude the astrophysical uncertainties, while the details 
are given in appendix [X[ The up-to-date situation for 
DM direct searches is described in |v] (with more details 
in appendix|B]), and we present the outcome for Bayesian 
tests and model comparison in section |VI[ Our conclu- 
sions are summarised in section fVIII 



II. SETUP OF THE STATISTICAL 
FRAMEWORK 

A. Parameter inference 

Given a set of parameters 6 defining a model A4, we 
are interested to compute their posterior probability dis- 
tribution function (pdf) ^(6*1^,7^) via Bayes' theorem, 
namely 



p{e\d,M) = 



£{e\M)p{e\M) 

pid\M) ■ 



(1) 



Here, d are the data under consideration, C{9\Ai) = 
p{d\9, A4) the likelihood function, andp(6'|A^) is the prior 
pdf for the parameters under the model. The quantity 
p{d\M), defined as 



p{d\M)= / £{0\M)p{9\M)d9 , 



(2) 



is called the Bayesian evidence. 

The posterior pdf contains all the necessary informa- 
tion for the interpretation of the data, however typically 
its dimensionality is reduced to n = 1,2 by integrating 
out the m nuisance parameter directions tp for 'graphical' 
purposes, yielding to the so-called marginal posterior pdf 

rn,M,...,e„\d) cx 

/ dV'i...d'0„ V{9i,...,9n,^i,:;ipm\d), (3) 

which is used to construct constraints on the remaining 
parameters as well. 

Provided the data are sufficiently constraining the 
marginal posterior is usually insensitive to the choice of 
prior. For data that can only provide an upper or a lower 
bound on a parameter however, the properties of the in- 
ferred posterior and the boundaries of credible regions 
can vary significantly with the choice of prior as well as its 
limits ^min and ^maxi making an objective interpretation 
of the results rather difficult. This is the case of exclusion 
limits: for them instead of computing credible intervals 
from the fractional volume of the marginal posterior we 
construct intervals based on the volume of the marginal 
posterior in S'-space, where S is the expected WIMP sig- 
nal, using a uniform prior on S with a lower boundary at 
zero [27]. To distinguish these iS-based credible intervals 
from the conventional ones based on the volume of the 
marginal posterior pdf, we label them with a subscript 
S, e.g. 90s%. For more details on this construction we 
refer to \M- 



B. Model comparison 

Bayesian inference is based on the posterior pdf for 
the parameters 9, and it assumes that the model under 
consideration, A4, is the correct one. We can however 
expand the inferential framework to the viability of the 
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model itself or of the relative performance of alterna- 
tive possible models as explanation for the data. The 
formalism of Bayesian model comparison automatically 
balances the quality of the model's fit to the data against 
its predictiveness, that is the best model achieves the 
optimum compromise between quality of fit and predic- 
tiveness and will have the highest posterior probability. 
In this sense, the methodology of Bayesian model selec- 
tion can be interpreted as a quantitative expression of 
the Occam's razor principle of simplicity. The Bayesian 
evidence takes into account the entire allowed range of 
parameters and it incorporates a well defined notion of 
probability for a model against another one. We define 
here the basics, while for a more in-depth discussion see 
e.g. EllMj. 

From equation [T] the posterior odds between two com- 
peting models A^o and TMi are given by 



TABLE I. Jeffreys' scale for grading the strength of evidence 
for two competing models Mo and Mi, adapted from [22ll31j . 



where 



p{Mo\d) p{Mo) 



p{d\Mo) 



(4) 



(5) 



is the Bayes factor, defined as the ratio of the models' ev- 
idences. The Bayes factor B represents an update from 
our prior belief in the odds of two competing models 
p{Mi)/p{Ma) to the posterior odds p{Mi\d) /p{MQ\d). 
If the two models have non-committal prior {p{A4i) = 
p{A4q)) the Bayes factor alone determines the outcome 
of the model comparison. Considering the logarithm of 
the Bayes factor, a positive value means that the model 
A^i is preferred over the model A^o as a description of 
the experimental data, and vice versa. The correspon- 
dence between the actual value of the Bayes factor and 
strength of belief follows the convention set down by Jef- 
freys' scale shown in table [l] 

From the definition of the Bayesian evidence in equa- 
tion ([2|, note how this quantity incorporates the notion 
of Occam's razor and penalises those models with ex- 
cessive complexity unsupported by the data for wasted 
parameter space. Increasing the dimensionality of the pa- 
rameter space without significantly enhancing the likeli- 
hood C{d\9,A4) in the new parameter directions reduces 
the evidence. Unpredictive priors p(0\M), namely exces- 
sively broad compared with the width of the likelihood, 
dilute the evidence as well. Hence a sensitivity analysis of 
the results of Bayesian model selection is necessary, since 
the choice of priors is usually not unique. This analysis 
assesses the dependence of In i? on a reasonable change of 
priors as follows. If the models A4o and Aii are nested 
and their parameter priors separable, then the impact 
of changing the prior width on the Bayes factor may be 
estimated analytically using the Savage-Dickey density 
ratio (SDDR, see [29]). The SDDR ratio depends only 
on the prior of the extra parameter: indeed if the data 
are sufhciently constraining, the marginal posterior pdf 



ln_B Odds Mi : Mo Strength of evidence 

<— 5.0 <1:150 Strong evidence for Mo 

-5.0 -2.5 1 : 150 1 : 12 Moderate evidence for A^o 
-2.5 -1.0 1 ; 12 -> 1 : 3 Weak evidence for Mo 
-1.0 -^1.0 1 : 3 3 : 1 Inconclusive 
1.0 —> 2.5 3 : 1 — > 12 : 1 Weak evidence against Mo 
2.5— >5.0 12:1— >-150:l Moderate evidence against Mo 
> 5.0 > 150 : 1 Strong evidence against A^o 



will exhibit little dependence on the prior, therefore pri- 
ors for common parameters factor out. If the prior of 
the extra parameter is a top-hat function, rescaling its 
width by a factor A will change In B by approximately 
— In A, as a consequence of priors being normalized to 
unity probability content HHj- 

For deciding whether the introduction of new param- 
eters in the theory is necessary, the frequentist approach 
relies on the ^XeS> based on the evaluation of the like- 
lihood at the best-fit point, and p-values, which return 
the probability of observing as extreme or more extreme 
values of the test statistic assuming the null hypothesis 
is true. For sake of reference we give as well the A^^g 
(defined as twice the difference between the best-fit likeli- 
hood values) and the classical p- values, following [551 [55]. 
For the nested models we consider, the extra parameters 
satisfy Chernoff's theorem [30], that is the null hypoth- 
esis sits on the boundary but the additional parameters 
are all defined under the null. The test statistics for the 
p-value is therefore a weighted sum of distributions. 



C. .if -test and ,«-test 

Model comparison is one application of Bayesian model 
selection, while another possibility is quantifying the con- 
sistency between two or more data sets (see e.g. [2T ] [24 l 
155] for particle physics applications). Any obvious ten- 
sion between experimental results is likely to be noticed 
by the 'chi by eye', as it is common practice in direct 
detection analyses. Indeed outcomes from different ex- 
periments may push the model parameters to different 
corners of the parameter space. Here we claim that it 
is important to privilege a method that quantifies these 
discrepancies, as follows. 

A full data set under consideration d can be divided 
into two parts as c? = {^, D}, where ^ is the subset we 
wish to test for compatibility with respect to the remain- 
ing data set D, which we take as reference. The condi- 
tional evidence p{!^\D) is the probability of measuring 
the data ^, knowing that the set D has been measured: 



pmD) 



p{D) 



(6) 
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Here D) is the joint evidence, that is the probabiUty 
of measuring the whole set d within the model under 
investigation. Note that this measure is independent on 
the actual values of the model parameters 9, which have 
been integrated out by definition of evidence. Then p{D) 
is the Bayesian evidence corresponding only to the data 
subset D and is a normalization factor that will cancel 
out. The conditioning on the model M. is understood in 
all the formulas of this section. We then define as 
the observed value for the variable S>. 

The first test we consider is called predictive likelihood 
test or test. The consistency of !^°^^ with the re- 
maining data D is evaluated by comparing p{&°^^\D) 
with the value of that maximises such probability, 
called 



(7) 



The ^ distribution is simply given by the ratio of the 
joint evidences at the observed and maximal value, by 
means of equation [6j This is analogous to a likelihood 
ratio in data space, that is integrated over all possible 
values of the models' parameters. More precisely, we 
evaluate the joint evidence as a function of the possible 
outcome of the observation of the data set & while at 
the same time the set D is kept fixed at its actual value. 
We take the freedom of varying the value of assum- 
ing the same errors on systematics as reported by the 
experiment. Then we measure the relative probability of 
obtaining the observed data realization to the max- 
imum probability of the data set in question. If the out- 
come of the comparison, \nJ^'{^°^^\D), is close to zero 
both data sets are compatible with each other and with 
the model assumptions. If however \n^{^^°^^\D) <C 
there is clearly a tension between D and This means 
that one should doubt the models' assumption or doubt 
& (or vice versa doubt the reference set) and look prop- 
erly for systematics. The .if -test is weakly dependent on 
the prior choice, being a likelihood ratio by definition and 
can be evaluated on a significance scale alike Ax^. 

The second test we perform is the ^-test, called model 
comparison test as well. In this case we test two hypothe- 
ses, again in data space. Suppose that Ho states that all 
the data sets under scrutiny are compatible with each 
other and with the models' assumption. On the contrary 
Hi affirms that the observed experimental outcomes are 
inconsistent so that each data set requires its own set of 
parameter values, since they privilege different regions in 
the parameter space. Then the Bayes factor between the 
two hypotheses, if we have no reason to prefer either Hq 
or Hi, is given by 



p(^°fa^jJ|Ho) 

pi^°^^\Hi)p{D\Hi) 



(8) 



For positive value of \nM{^°^^) the data sets are com- 
patible, while for negative values the alternative hypoth- 
esis Hi is preferred. The strength of evidence against/in 



favour of Ho is assessed by the Jeffreys' scale (table |l| as 
for Bayesian model selection. 

In this paper the data we wish to test by means of the 
^-test is the number of observed events at XelOO exper- 
iment, & = A^eventsj whilc the reference data are given 
by the combined set D = {DAMA, CoGeNT, CRESST}. 
We investigate through the ^-test the hypothesis of com- 
patibility between data sets: Hq believes that XelOO out- 
come is consistent with the combined set, while Hi de- 
notes the incompatibility hypothesis. 

The computation of the evidence p{d\Ai) for each 
model Ai requires the evaluation of an integral over 
the parameter space. We use the ellipsoidal and mul- 
timodal nested-sampling algorithm implemented in the 
publicly available package MultiNest v2.12 [331 [33]. We 
set niivc — 10000, an efficiency factor of 10"''' and a toler- 
ance factor of 0.01 [33], which ensure that the sampling is 
accurate enough to have a parameter estimation similar 
to Markov-Chains Monte Carlo sampling methods. 



III. DIRECT DETECTION RATES AND 
INTERACTION SCENARIOS Mi 

The differential spectrum for a nuclear recoil due to 
scattering of a WIMP, in units of events per time per 
detector mass per energy, has the form 



Pq 



dR 

dE mDM 



v' >v' 



d'v'^v'fiv'it)), (9) 



where E is the energy transferred during the collision, 
Pq = pum{Ro) the WIMP density in the solar neigh- 
bourhood, TTiDM the dark matter mass and da/dE the 
differential cross section for the scattering. f{v'{t)) is 
the WIMP velocity distribution in the Earth's rest frame 
normalised such that J d'^v' f{v'{t)) = 1, which we de- 
scribe in section llVBI 

The total number of recoils expected in a detector in 
a given observed energy range [£'i,£2] is obtained by in- 
tegrating equation ^ over energy 



5(i) = A/detT / dEe{qE) 

JSl/q 



dR 

dE 



(10) 



where M^ot T denotes the detector total mass times 
the exposure time. We have folded into the integral 
an energy-dependent function e{qE) describing the ef- 
ficiency of the detector. The quenching factor q, defined 
via £ = qE, denotes the fraction of recoil energy that is 
ultimately observed in a specific detection channel and 
is a detector-dependent quantity. To distinguish £ from 
the actual nuclear recoil energy E, the former is usually 
given in units of keVee (electron equivalent keV), while 
the latter in keVnr (nuclear recoil keV) or simply keV. 

In our analysis, we consider spin-independent (SI) scat- 
tering off nuclei, encoded in the differential cross-section 
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in the following way: 

A^^Mj^af [hZ+{A-Z)f,)' 
dE 



2 ,/2 



J n 



T'iE), (11) 



where /i„ = mDM^n/ ("Idm + nin) is the WIMP-nucleon 
reduced mass, af^ the spin-independent zero-momentum 
WIMP-nucleon cross-section, Z (A) the atomic (mass) 
number of the target nucleus used, and fp (/„) is the 
WIMP effective coherent coupling to the proton (neu- 
tron). The nuclear form factor J^{E) characterises the 
loss of coherence for nonzero momentum transfer: a fair 
approximation for all nuclei is the Helm form factor I35j . 
We consider three hypotheses for the type of interaction, 
further treated as nested models A4i, as follows. 
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Elastic scattering (Aia) 

This is the standard interaction common to many 
WIMP models. In practice it consists in the fol- 
lowing assumptions. The integration in equation [9] 
is performed over all incident particles capable 
of depositing a recoil energy of E, which implies 
a lower integration limit of v'^^^^ = y/Mf^E/2fi, 
where Ma/ is the mass of the target nucleus, and 
jj, = mr,MM_\f / {niBM + M_\f) is the WIMP-nucleus 



reduced mass. In equation 11 we set /„ — fp, that 
is same coupling to neutron and proton and conse- 
quently the interaction scales as usual as A^. This 
corresponds to scalar interaction, e.g. DM scatter- 
ing off nucleons exchanging a Higgs boson. There 
are two theoretical parameters for the WIMP inter- 
action: mDM and cr^^. For model comparison this 
is the simplest model, called hereafter A^o- 

2. Inelastic scattering (A^i) [T7] 

A WIMP X ma-y scatter off nuclei only by mak- 
ing a transition into an heavier state: x-^ ~^ X*-^- 
The two DM mass eigenstates have a mass splitting 
proportional to Am = S, which is of the order of 
©(keV) in order to the scatter to occur. Only parti- 
cles in the very high tale of the velocity distribution 
will have enough energy to produce a recoil in the 
detector, that translates into a modified minimal 
scattering velocity: 



1 



(MmEr 



2M^Er V ^lr. 



(12) 



Heavy nuclei will be particularly sensitive to this 
interaction, therefore for this scenario we do not 
consider data on Si, F and CI. There are 3 free 
parameters: same as in A^o plus the mass splitting 
5, which we vary with a flat prior between (elastic 
limit) to 200 keV. This model is denoted Ali in 
Bayesian comparison. 

3. Isospin violating scattering {M.2) [H] 

This model relies on the hypothesis that the WIMP 
interaction with the neutron and the proton might 



TABLE II. MultiNest parameters and priors for the WIMP 
parameter space in the three models of SI interaction consid- 
ered in this work. All priors are uniform over the indicated 
range. 



Model 


Parameter 


Prior 


All 


log(mDM/GeV) 


3 


All 




-46 -36 


Inelastic [Mi) 


(5/(keV) 


200 


Isospin violating {M2) 


fn/fp 


-2 1 



be of different strength, namely /„ 7^ fp in equa- 
tion [TT] The minimal velocity is defined as for the 
elastic interaction. The SI cross-section is the mean 
between the one on neutron and the one on proton: 



.SI 



(13) 



Different nuclei isotopes, each with abundance in 
the detector, are taken into account replacing the 
factor with an effective one: 



^off 



^ 2r, [Zfp + {A, - Z)f„f , (14) 

i— isotopes 



following [35]. There are 3 free parameters: the two 
as in Aio plus fn/ fp- We let free to vary this ratio 
from -2 (an asymptotic limit at which all nuclei 
behave the same) to 1 (elastic scattering limit) with 
a flat prior, not to favour any value in particular. 
This model will be referred as A^2- 

The parameters describing the WIMP interaction in 
each model are resumed together with their prior range 
in table [h] The choice for flat/log priors we follow here 
has been discussed in [TOl. 



IV. LIKELIHOOD DEFINITION 

In this section we shortly define the likelihood function 
for CRESST, KIMS and bubble chamber experiments. 
We review the likelihood for XelOO, in light of the re- 
cent data 0] as well. For DAMA and CDMS on Sili- 
con we use the set up defined in [101, while for CoGeNT 
we use the publicly available data, see We do not 

consider CDMS data on Ge, that have been discussed 
extensively in [lOj . since they are less constraining than 
other exclusion bounds considered in this analysis. We do 
not consider the low energy analyses by XENONIO [37] 
and CDMS [3S], as well as the modulated analysis by 
CDMS [39] because of the lack of a reliable parametriza- 
tion of the background making difficult the construction 
of a meaningful likelihood function for our Bayesian anal- 
ysis. 
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We resume all the experiments we consider with their 
nuisance parameters, due to systematics, and their prior 
range in table |III| The details about likelihood construc- 
tion are presented in appendix |Xj At the end of the sec- 
tion we briefly recall how nuisance parameters coming 
from astrophysics are implemented. 



A. Experimental likelihoods 

XENON 100 The likelihood ln/:xcioo is defined 
in [inj, implemented however with the latest data. The 
last scientific run has observed 2 events (A'obs = 2). Ac- 
tually it is precisely A^obs that will be tested under ^ and 
^-tests. We will compute the joint evidence for XelOO 
and the combined set {DAMA, CRESST, CoGeNT}, as 
in equation [6] For this purpose we scan over a finite 
number of realizations under the variable Afovcnts: 



: 0, 10, 60 (100) in intervals of 10 (15) 



plus the evaluation of the joint evidence at A'ovcnts ~ 
Aobs- We choose the maximum numbers of events that 
can be seen by XelOO in 225 live day of run to be 60, 
which is reasonable compared to the forecasts in [T^]. 
We then interpolate between data points with a spline to 
get the joint evidence as a function in data space. 

CRESST The likelihood is constructed on the total 
number of events seen in each detector module and on the 
background modelling given in section 4 of . The yield 
information is not included in the analysis. The back- 
grounds constitute the nuisance parameters, over which 
we marginalise. 

Bubble chamber experiments We consider PI- 
CASSO [15^ and SIMPLE, phase II [Hj. These detectors 
capture phase transitions produced by the energy 
deposition of a charged particle traversing the liquid, if 
the generated heat spike occurs within a certain critical 
length and exceeds a certain critical energy. The event 
is accompanied by an acoustic signal. Therefore the 
detectors perform as threshold devices, controlled by 
setting the temperature T and/or the pressure. The 
relation between the energy threshold Eth{T) and the 
temperature is obtained at a fixed pressure during the 
calibration process. The observed rate per day per kg of 
material is then defined as: 



S 



dEP{E,EtY,{T)) 



dR 



(16) 



where -Bmax is the maximum energy released by a WIMP 
with a certain escape velocity Vcsc and P{E,Et\i{T)) de- 
scribes the effect of a finite resolution at threshold, ap- 
proximated by: 



P{E,Eti,{T)) = 1-exp 



aiT) 1 



E 



Sth(T) 



(17) 



The parameter a{T) defines the steepness of the energy 
threshold, and is a nuisance parameter for both experi- 



TABLE III. MultiNest parameters and priors for experimen- 
tal systematics (nuisance parameters) . All priors are uniform 
over the indicated range. 



Experiment 


Parameter 


Prior 


DAMA 




0.2 0.4 


DAMA 




0.06 0.1 


CoGeNT 


C 


10 cpd/kg/keVee 


CoGeNT 


£o 


30 keVee 


CoGeNT 


Gn 


-> 10 cpd/kg/keVee 


CRESST 


N^ 


5 — > 17 counts 


CRESST 


Cpb 


1 — >■ 7 counts/keV 


CRESST 




3.3 -> 34 counts 


XelOO 


m 


-0.01 0.18 


PICASSO 


a{T) 


1 ^ 11 


SIMPLE 


aiT) 


1 ^ 11 


KIMS 


U 


0.06 0.1 


KIMS 


qcs 


0.06-^ 0.1 


KIMS 


Be 


-> 0.4 



TABLE IV. Astrophysical constraints on the DM halo profile 
and the WIMP velocity distribution. 



Observable 


Constraint 


Local standard of rest 


,,obs _ 


230 ± 24.4 km s"^ gO] 


Escape velocity 




544 ± 39 km s"^ gT] 


Local DM density 


„obs _ 
PQ — 


0.4 ±0.2 GeV cm'^ [12] 


Virial mass 


-i'J vir 


= 2.7 ±0.3 X lO^^M© [43] 



ments. The details on the remaining of the likelihood are 
given in the appendix|^for each collaboration separately. 

KIMS This experiment [Jl] has a binned Gaussian 
likelihood for describing the counts/keV/kg/day seen in 
the detectors, which are compatible with the no detection 
hypothesis. In addition it has three nuisance parameters 
from a background and quenching factors. 



B. Astrophysical uncertainties 

As for the WIMP velocity distribution entering in the 
rate equation |9j we consider two alternatives. For details 
we refer to [inilSS]. 

1. The standard halo model (SMH) 

It is commonly used in direct detection prediction 
for extracting experimental bounds and consists in 
a Maxwellian distribution with fixed astrophysical 
parameters vq, v^sc and pq. We choose to fix the 
parameters at their mean value, as given in ta- 
ble |IV[ It allows to clearly visualise the sensitivity of 
exclusion bounds/detection regions on experimen- 
tal systematics. 



2. DM density profile (NEW) 
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We construct self consistent halo distributions 
starting from a motivated DM density profile, the 
NFW halo distribution [33], as shown in [lU]. 
The DM density profile is constructed from the 
virial mass M^-„ and the concentration parame- 
ter Cvir- Then by means of the Eddington for- 
mula we extract the corresponding velocity distri- 
bution. We marginalise over the nuisance param- 
eters Mvir, Cvir, Wo, Wcsc and pq. The astrophysical 
likelihood is given by 



"0 "CSC 

with gaussian prior centered on the experimental 
measured values quoted in table |IV| The normal- 
ization factor Cnorm = ln(27rcr^^) -I- ln(27rCTp^) -|- 
ln(27r(T^^^__ ) -|- \a{2TTa\.j . ) is fundamental for com- 
puting the evidence. 

Other DM density profiles give similar results on 
the {rriDM, c^^}-plane, namely the exact shape of 
the DM halo density profile, at least within the 
class of spherically symmetric, smooth profiles, 
does not yet play a role in direct DM searches, as 
shown in 110'. Even if it does not capture com- 
pletely the distribution in the galaxies of DM par- 
ticles |45j . it is a fair approximation to consider a 
NFW density profile. 

V. STATE OF THE ART 

The present situation of direct detection experiments 
is shortly illustrated for the three spin-independent inter- 
action models we consider in this work. For more details 
we refer to appendix [Bj 

Elastic SI scattering (model Aio) The 2D marginal 
posterior pdf in the {m-QMi cr^^j-plane for all the individ- 
ual experiments is combined in a single plot in figure [T] 
We first consider the left panel, where the astrophysical 
quantities are fixed at their mean value and only the ef- 
fects of marginalising over systematics appear. One can 
easily recognise the DAMA credible region (shaded), the 
CoGeNT one (red non filled) and the CRESST region 
(blue non filled) with contours at 90% and 99%. All ex- 
clusion bounds are at 90s% confidence level. By means of 
the 'chi by eye', it is apparent that DAMA and CRESST 
are disfavoured at 90s% by XelOO, while CoGeNT is stiU 
partially compatible. On the same foot the PICASSO 
upper limit challenges DAMA, which is incompatible at 
90s%, while being compatible with CoGeNT. AU other 
exclusion limits (as labelled in the caption) are less rel- 
evant for the elastic spin-independent scenario. None of 
the nuisance parameters show an interesting behavior. 

The right panel of figure [T] displays the case of a veloc- 
ity distribution constructed starting from a NFW halo 



TABLE V. ID posterior pdf modes and 90% credible intervals 
for the circular velocity uq, escape velocity Vcbc, and the local 
DM density pQ for NFW density profile considered in this 
work and for the elastic (A^o), inelastic [Mi] and isospin 
violating (^^2) scenarios. 





f() (kms 


Vcsc (kms ^) 


Pq (GeVcm"^) 


DAMA 

CoGeNT 

CRESST 

PICASSO 

XelOO 


219+?^ 
221+t^ 
2211^; 
22ltll 


559 ± 17 

558+1 
558_iB 


"■•^ ' -0.09 
r, 07+0.20 
"■•^ ' -0.08 

0.38t«:lg 
0.38l°:lg 
0.40t«;» 


Ml 

DAMA 

CoGeNT 

CRESST 

KIMS 

XelOO 


221+_:{l 

225+J^ 

222+tJ 

22Qtfi 
990+37 


5581:11 
558lf6 
558tl° 
5581:11 
5581?? 


o.38i«:- 
o.4oig:J«3 

038.1g:lg 
0.38l°:lg 
0.39l°:lt 


M2 

DAMA 

CoGeNT 

CRESST 

PICASSO 

XelOO 


220l?| 
21'Sltfi 

000 + 39 

2211^; 
999+37 


558+l» 

558+?? 
5581?° 
558t?7 


r, 00 + 0.14 
U.J»_o.o9 

o.37lg:i^ 

0.38l°:J| 
0.38lg;lg 
0.39l°:lt 


profile for the dark matter with marginalisation over the 



astrophysical parameters, in addition to the systemat- 
ics. Firstly, we note that allowing for uncertainties in 
the astrophysics significantly expands the closed regions 
of DAMA, CoGeNT and CRESST, while the exclusion 
limits tend to shift a little to the right. This increases 
the compatibility: DAMA, CoGeNT and CRESST cred- 
ible regions overlap now within their 90% contours and 
are partially compatible with both XelOO and PICASSO 
at 90s%. Secondly we note that direct DM searches 
are not at the moment contributing towards constraining 
the astrophysics of the problem. Indeed for a given DM 
halo profile the preferred values for uq, Ucsc and pq and 
their associated uncertainties are virtually independent 
of the additional constraints from the DM experiments. 
As a consequence the experimental systematics follow the 
same trend as for SMH case. For a given DM density pro- 
file, the preferred value for the astrophysical parameter is 
very similar in all the three spin-independent scenarios, 
as confirmed by table |Vj an insight on the astrophysical 
properties of the DM by means of particle physics (and 
vice versa) appears beyond the current potential of direct 
searches. 

In the light of the above considerations, we present 
the other interaction models marginalised over the astro- 
physics. 

Inelastic SI scattering (model Mi) The summary in 
a single plot of all individual experimental outcomes is 
given in figure [2] (left panel) as a function of the dark 
matter mass and scattering cross-section. Same labelling 
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FIG. 1. Elastic spin-independent (SI) scattering. Left: 2D credible regions for the individual experimental bounds and regions 
assuming fixed astrophysical parameters (SMH), combined in a single plot. For DAM A (shaded), CoGeNT (red) and CRESST 
(blue) we show the 90% and 99% contours. The orange line represents the 90s% bound for CDMSSi, the magenta curve is 
for XelOO, the light gray curve stands for SIMPLE, the light green is for PICASSO. The experimental nuisance parameters 
are marginalised over. Right: Same as left with the marginalisation over the astrophysical uncertainties using a NFW density 
profile for the dark matter. Only XelOO (magenta solid) and PICASSO (light green solid) are shown, because they are the 
most constraining ones for the considered scenario. 



as for elastic SI case for detection regions; in this case 
the most constraining experiments are XelOO (magenta) 
and KIMS (green), the only ones shown in the plot. The 
usual Iodine region for DAM A is excluded at 905% by 
both experiments, however there is room for a consis- 
tent explanation at low WIMP mass at 90 s% confidence 
level. This is again a 'chi by eye' consideration, and we 
will show that Bayesian model comparison may come out 
with different results, because of the Occams' razor prin- 
ciple. The exclusion bounds and detection regions are 
affected by a volume effect not only due to astrophysical 
marginalisation but also due to marginalisation over the 
mass splitting parameter S. In appendix |B] the experi- 
mental dependence on it is detailed. 

Isospin violating SI interaction (model Ai2) The 
right panel of figure [l] illustrates the state of the art for 
isospin violating SI scattering (contours/lines labelling in 
the caption). All the three detection regions overlap for 
a^^ ~ 10~^^cm^ and a DM mass of 10 GeV: the data are 
compatible at 90% confidence level. The closed contours 
again are enlarged by volume effects due to marginalisa- 
tion over the isospin violating parameter fn/ fp- Moving 
on the exclusion bounds we see immediately that XelOO 
is the most constraining experiment for DM masses above 
15 GeV while below that value it does find common 
ground for DAMA, CoGeNT and CRESST. This is by 
virtue of the isospin violating interaction, which depletes 
the interaction on Xe whit respect to Na or partially Ge 
in a certain range of fn/ fp ~ —0.7. The low mass regions 
of DAMA, CRESST and CoGeNT are compatible with 



the 905% upper bound of PICASSO as well. By means 
of the 'chi by eye', we could conclude, as in the case of 
inelastic SI scattering, that this particle physics scenario 
accomplishes a better agreement between individual de- 
tection regions among themselves and with the exclusion 
bounds than the elastic SI scenario. We might want to 
confront these statements with the outcomes of Bayesian 
model selection. 

In conclusion at present XelOO is the exclusion bound 
that really challenges the detection regions in all the SI 
scenarios we have considered. In the next section we 
assess rigorously at which statistical significance they are 
(in) consistent within each other. 



VI. RESULTS AND DISCUSSION 



Here we describe the outcomes for Bayesian consis- 
tency tests between XelOO and the detection regions, sec- 
tion EEl We will find that in all scenarios but inelastic 
SI model the inconsistency is at the level of 2cr. It is 
therefore not interesting neither meaningful to attempt a 
global fit: we limit at the detection regions the investiga- 
tion on how direct detection data can constrain particle 



physics models, in section VI B 
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FIG. 2. Left: Same as figure[T]for inelastic SI interaction. The green dark (magenta) curve is the KIMS (XelOO) 90g% exclusion 
bound. The mass splitting direction 5 has been marginalised over. Right: Same as figure [T] for isospin violating interaction. 
The light green (magenta) line denotes the PICASSO (XelOO) exclusion limit at 90s% confidence level. The isospin violating 
parameter f„/fp has been marginalised over. In both panels the astrophysical uncertainties are marginalised over, considering 
a NFW density profile for the dark matter, as well as all the experimental systematics. Only the most constraining exclusion 
limits are shown. Labelling for the closed regions is as in figure [l] 



A. Consistency tests 

Regarding the assessment of compatibility be- 
tween the data sets & = iVovonts and D = 
{DAMA,CoGeNT,CRESST}, we present our predic- 
tions in data space and not anymore in the model pa- 
rameter space, because of the definition of equation |6] 

We first discuss the ^-test. We have considered dif- 
ferent possible outcomes for the observed number of 
events in the XelOO detector, with fbced instrumental 
noise as reported by the collaboration, which is a rea- 
sonable assumption. We have evaluated the conditional 
evidence p(^|Z3) and computed the predictive probabil- 
ity on a grid of values for A'ovcnts- The relevant quantity 
ln^(iVovcnts|£') is plotted in figure[3]as a function of the 
possible outcome of the experimental observation, with 
the actual observed value denoted by a solid black ver- 
tical line. The elastic SI scattering is given in the left 
panel. Consider first the blue line/diamonds: the pre- 
dictive probability grows fast increasing the number of 
events seen in the detector. This indicates that actu- 
ally the compatibility of this experiment with D increases 
augmenting the number of events seen in XelOO. In other 
words a number of events larger than 2 should have been 
observed for ^ and D to be consistent. We see in ad- 
dition that the maximum of the probability depends on 
the maximum number of events we assume have been 
seen. Considering iVmax = 60 the discrepancy between 
the data sets D and ^ is larger than 3(t. Augmenting 
the number of 'observed' events in the detector (green 
line and square, with A^events = 100) would lead to a 



even larger discrepancy. In the right panel, the predic- 
tive probability for the inelastic SI scattering scenario 
(blue/diamonds) has the opposite behavior than Mq: 
the finest agreement between XelOO and the combined 
fit is found for observed events. This actually is sup- 
ported by the parameter inference (discussed below) be- 
cause the combined fit D favours the low DM mass, while 
XelOO inelastic is unable to exclude such region. There- 
fore augmenting the observed number of events leads to 
an increasing inconsistency. We conclude that for in- 
elastic interaction XelOO is compatible within la with 
DAMA, CoGeNT and CRESST and this significance is 
robust against the assumed value of A^max- The isospin 
violating SI scenario (green/squares) follows closely the 
behavior of elastic scattering, although the discrepancy 
in that case is marginal, at the level of 2a, for A^ax = 60. 

Note that this probability distribution does not make 
advantage of the spectral information of the Aovcnts in 
the likelihood {e.g. for a light WIMP the events should 
be concentrated in the low energy part of the detection 
range) and keeps growing by increasing the number of 
observed events. It can be taken therefore as a conserva- 
tive assessment of significance, that may be reduced by 
allowing this extra information. The (in)consistency be- 
tween XelOO and D in the isospin violating scenario may 
be lowered to la level assuming at most 20 events in the 
detector. For the same number of events and elastic SI 
picture, the experimental data sets are still incompatible 
but with a statistical significance of only 2a. 

The ^-test tries to enforce consistency between !^ and 



D: our results are reported in table VI for the actual 
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FIG. 3. Left: Predictive data distribution test) for the number of events Afevonts in XelOO detector, as defined in equation[7| 
The curve represents the conditional evidence of XelOO and the combined set {D = {DAMA, CoGeNT, CRESST}) at a given 
data point, divided by the maximum of the probabihty, for elastic SI interaction. The blue dashed curve is for a maximum of 60 
events, while the green dot-dashed line stands for A'^max ~ 100. The vertical line gives the actual measured value Aobs = 2. The 
data points denote the location at which the predictive probability has been computed and the lines are spline interpolation 
between those points. The horizontal dashed line represents the 1, 2 and 3a significance. Right: Same as left for inelastic {A4i) 
and isospin violating {M2) scenarios, blue and green data points/line respectively. In these scenarios we assumed Amax = 60. 
All astrophysical uncertainties and experimental systematics have been marginalised over, as well as all the model parameters. 



TABLE VI. Results for the test, providing the relative 
odds between the consistency hypothesis "Ho (XelOO and D = 
{DAMA, CoGeNT, CRESST} consistent with each other) and 
the incongruous belief of Hi (XelOO and D inconsistent). Ho 
is favoured for In,^ > 0, while the data sets are in ten- 
sion with each other for In^ < 0. We give the test results 
in the three particle physics scenarios under investigation, 
as labelled. The statistical interpretation is in accordance 
with Jeffreys' scale, given in table |I] and the definition of the 
Si—tesX is given in equation [S] All astrophysical uncertainties 
and experimental systematics have been marginalised over, as 
well as the model parameters. 



Model 


ln^(Aobs = 2) 


Interpretation 


Mo 


-0.32 ±0.07 


Inconclusive evidence against Ho 


Ml 


-0.53 ±0.07 


Inconclusive evidence against Ho 


M2 


-0.22 ±0.07 


Inconclusive evidence against Ho 



number of events of XelOO. In all scenarios, there is 
inconclusive evidence against the hypothesis of compat- 
ibility between XelOO and D. This can be understood 
as follows. This test deems the joint evidence in order to 
make compatible data that come from different regions of 
the parameter space. The joint evidence p{!^, D) is nicely 



unimodal and sharply peaked around 7 GeV in the DM 
mass parameter with cross-section that depends on the 
particle physics scenario. Each of the best fit points are 
fairly compatible with inference for D alone (see figure |6|, 
while individually & has a very broad and flat posterior 
probability distribution. However in order to find a com- 
mon ground the combined set D and the XclOG data need 
to tune the astrophysical parameters: apart from the in- 
elastic model (which is fine as it is, as shown already by 
the ^-test) the preferred local circular velocity is now 
253 km/s, with an escape velocity of 568 km/s and a 
DM density at the solar position of ^ 0.5GeV/cm'^, val- 
ues different from the one in table |Vl Those values are in 
the tail of the distribution of the observed values, as given 
in table |IV] Because of the adjustment of the astrophys- 
ical parameters and the widespread original likelihood of 
XelOO, this test is inconclusive. It is interesting however 
that the astrophysics in this case plays a fundamental 
role. Possibly more sophisticated DM halo models, be- 
sides the smooth and spherically symmetric ones, may 
increase the consistency between data sets. 

These tests can be easily performed for every exclusion 
bound versus the combined set, taking into account the 
time consuming numerical calculations. They are better 
suited for quantifying consistency between data sets that 
a global x^, because definitely the distribution of the test 
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TABLE VII. Bayes factors for the particle physics scenar- 
ios analysed in this work for fixed astrophysical parameters 
(SMH), for the individual detection regions and for the com- 
bined fit. 







\nB 


Experiments 


Ml : Mo 


M2 ■■ Mo 


DAMA 


+0.45 


-0.27 


CoGeNT 


-2.52 


-0.13 


CRESST 


-0.58 


-0.27 


Combined 


-2.38 


-0.7 



statistics for detection limits does not certainly follow a 



B. Model comparison 

The .^-test indicates in general an inconsistency be- 
tween the XelOO exclusion limit and the combined set D, 
with a statistical significance that depends on the particle 
physics model A4i. To answer then to the second ques- 
tion addressed in this paper, what is the best particle 
physics model that can account for the data, we consider 
only the detection regions, individually and combined to- 
gether. 

The main results for Bayesian model comparison are 
the Bayes factors for the nested models Aii (inelastic) 
and 7W2 (isospin violating) versus A4q (elastic). These 
are shown in figure |4j while in table VIII| the correspond- 
ing odds against the simplest model are listed, together 
with the Axcff and the p— values. We recall that both 
Ml and have one extra free parameter with respect 
to A^o, ^ and fn/fp respectively. Astrophysical uncer- 
tainties have been marginalised over. 

We confirm that for nested models the Bayes factor de- 
pends only on the prior of the additional parameter, while 
the ones related to common parameters cancel out. In- 
deed in table [Vn] the Bayes factors for fixed astrophysics 
are shown: they provide strength of evidence alike fig- 
ure |4] where all nuisance and astrophysical parameters 
are marginalised over. 

From figure [4j DAMA is the only experiment which 
shows a positive InB for both A4i and A42- these sce- 
narios are favoured with respect to the elastic SI model, 
even though the evidence is inconclusive in both cases, 
with the odds of only 2 : 1 in favour of the most com- 
plicated models. This is confirmed by the small values 
of Ax^jf, meaning that the additional parameter {S or 
fn/fp) does not actually improuve the quality of the fit. 
Regarding these parameters, from figure [5j we see that 
the marginal ID posterior pdf (blue dashed left panel) 
for 6 has two peaks, one for Na and one for I, while the 
ID posterior pdf for fn/fp denotes a suppression of the 
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FIG. 4. Bayes factors for the particle physics scenarios anal- 
ysed in this work. The experiments are specified on the ver- 
tical axis, while the different symbols refer to the model for 
which the Bayes factors have been computed, as labelled in 
the plot. The numerical value is specified near the data point. 
The Bayes factors have uncertainties of (0.02, 0.03) for the in- 
dividual experiments and ~ 0.07 for the combined analysis. 
Following Jeffrey's scale in table |T] the vertical lines separate 
the distinct empirical gradings of the strength of the evidence. 



TABLE VIII. Odds, Axcs values and corresponding classical 
p- values of the null hypothesis for the different particle physics 
scenarios relative to Mo, elastic SI interaction. We consider a 
NEW density profile for the DM and marginalise over the as- 
trophysical uncertainties and experimental systematics. The 
classical p-values are obtained via Chernoff's theorem with 
one extra parameter in the alternative hypothesis relative to 
the null. 









M, : 


Mo 


Ml Inelastic DM 


odds 


AXoft p-values 


DAMA 


2 


1 


1.95 


0.08 


CoGeNT 


1 


37 


0.87 


0.18 


CRESST 


1 


2 


0.04 


0.42 


Combined 


1 


32 


0.71 


0.20 


M2 Isospin violating DM 










DAMA 


2 


1 


1.88 


0.09 


CoGeNT 


1 


3 


0.12 


0.36 


CRESST 


1 


1 


0.03 


0.43 


Combined 


1 


2 


8.56 


0.002 



interaction for 
panel) . 



1 < fn/fp < —0.5 (blue dashed central 



On the contrary, CoGeNT prefers the simple elastic 
scenario, with weak evidence against A^2 and a moder- 
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FIG. 5. Left: ID marginal posterior pdf for the mass splitting S in the inelastic DM scenario for DAMA (blue dashed), for 
CoGeNT (red dot-dashed), for CRESST (green dotted) and for the combined fit (black solid). Central: Same as left for the 
isospin parameter f„/ fp in M2- Right: ID marginal posterior for the sodium quenching factor gNa for the combined fit in all 
the three particle physics scenarios as follows. The black solid line denotes the elastic SI model, the blue dashed the inelastic 
SI model and the red dot-dashed the isospin violating DM scenario. In all panels the astrophysical uncertainties have been 
marginalised over. 



ate evidence against Mi. In particular, inelastic SI scat- 
tering is disfavoured with the odds of 1 : 37 because a 
large portion of the additional parameter space is wasted 
and the likelihood does not reach enough improvement 
not to be deemed by the unpredictive prior. CoGeNT 
clearly likes light WIMPs with almost elastic collisions 
(the preferred value for (5 is 6 keV) as confirmed by the 
ID marginal posterior pdf in figure [5] (left panel red line). 
We see an example of Occams' razor principle at work: 
the more complicated model is disfavoured because the 
likelihood is not predictive enough to compensate the vol- 
ume increase due to the extra additional parameter. Less 
conclusive is the outcome for the isospin violating model 
with the odds of 1 : 3 against M2, supported by an al- 
most flat /„ / fp in all prior range except for a deep around 
fn/fp ^ —0.7, figure [5] (central panel). 

CRESST indicates inconclusive evidence against both 
Ml and M2- The CRESST data are not able to con- 
strain the nested models with respect to the null hypoth- 
esis, the odds are at most 1 : 2. This is confirmed by the 
broad ID marginal posterior pdf for both S and /„/ / fp 
in figure [5] (left and central panels, green dotted lines). 
The behavior of fn/fp is a consequence of multi-target 
detectors: for instance depending on the atomic element, 
different values of fn/fp might be suppressed, leading in 
complex to an almost flat behavior. 

The outcome of model selection for the combined flt is 
driven by CoGeNT data: indeed In B indicates a moder- 
ate evidence against Mi with the corresponding odds of 
1 : 32. The combined posterior pdf (black solid) follows 
closely the one of CoGeNT (red dot-dashed) in the left 
panel in figure [Sj The 90% and 99% credible regions m 
the {muM, cr^'^}-plane are shown in figure [g] (magenta non 
filled). The inelastic SI scenario favours similar values for 
mass and cross-section as elastic case (shaded region), 
that is rriDM ~ 7 GeV and trf^ ~ 10~^°cm^. One has to 
look along the third direction to check if the agreement 



provides really a good fit to all of the experiments: re- 
garding the astrophysical parameters the preferred values 
are pq = 0.34GeV/cm^, vq — 212km/s, Wosc = 556km/s 
for Mo and vn = 220km/s and pq = 0.37GeV/cm for 
M 1 . This long list of preferred values demonstrates that 
the nuisance parameters select values which are in line 
with the best flt point of the individual experiments. The 
only exception is the Na quenching factor, right panel of 
flgure |5] it peaks at ^ 0.55 for all the particle physics 
models. Even though gNa tends towards a corner of the 
prior range, this value is still compatible with the exper- 
imental allowed range [46 . 

On the contrary of inelastic SI scattering, the evidence 
against is only inconclusive. A frequentist approach 
would have preferred this model with respect to elastic 
SI interaction on the line with the 'chi by eye' outcome 
(as we discussed for flgure [2]). The p-value is 0.002 cor- 
responding to 3(7 against the null, having considered a 
gaussian distribution for the test statistic. This is an 
example of Lindley's paradox (namely Bayesian model 
selection returning a different result from classical hy- 
pothesis testing, see and references therein): looking 
at flgure [5] second panel, the ID posterior pdf for fn/fp 
is sharply peaked around its preferred value, meaning 
that the broad range prior is diluting the evidence for 
M2, contrary to the single experiments, where fn/fp is 
non negligible in all the prior range. The marginal 2D 
posterior pdf in the {ttidmi cr^^ }-plane is given by the red 
contours in flgure |6j and prefers large values of the cross- 
section for a 10 GeV DM mass with respect to the other 
scenarios. Again the astrophysical parameters are in line 
with those of the single experiments. 

Resuming, we argue that the current experimental sit- 
uation disfavours the inelastic DM picture because of Co- 
GeNT data. The p-value of 0.2 corresponds formally to 
a 1.3 a exclusion with respect to the null hypothesis. On 
the other hand the outcome between elastic and isospin 
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FIG. 6. 2D credible regions at 90% and 99% for the com- 
bined data set {D = {DAMA, CRESST, CoGeNT}) for the 
three particle physics scenarios considered in this work, com- 
bined in a single plot. Elastic SI interaction is given by the 
shaded region, the magenta contours (lower non filled region) 
are for the inelastic scenario, while the red curves (upper non 
filled region) are for the isospin violating model. All the nui- 
sance parameters have been marginalised over, as well as the 
additional model parameters. 

violating SI scattering has an inconclusive strength of ev- 
idence, meaning that the complexity due to the extra free 
parameter is not supported yet. 

Our conclusions are robust against changes in prior 
range of the extra free parameter. By means of the SDDR 
we evaluate the impact of changing the prior range of 
the extra free parameter. The odds for a more complex 
model can be made arbitrarily small by increasing the 
width of the priors on the additional parameters or by 
choosing uniform priors on non-linear functions of this 
parameter. Note that a rescaling by a factor of 2 (J : — 
100 keV instead of 200 keV) would still disfavour 
moderately Aii with respect to A4o for CoGeNT. On 
the other hand it can turn it into a positive evidence for 
Ml versus Mo for DAMA and CRESST, although still 
inconclusive. The main conclusion for the combined set 
would still be valid as well. For isospin violating model, 
a reduction in the prior range by a factor of A = 2 would 
still lead to inconclusive evidence between M2 and Mo 
in all experiments. 

VII. CONCLUSIONS 

Currently the direct detection experiments exhibit con- 
trasting outcomes, leading to an ambiguous situation. 
We have applied Bayesian statistical tools to three model 
independent scenarios for spin-independent scattering: 



elastic, inelastic and isospin violating. We have resumed 
the state of the art of these three models using the lat- 
est results of DAMA, CoGeNT, CRESST, XelOO, KIMS 
and bubble chamber experiments; the experimental sys- 
tematic have been carefully modelled in the likelihood. 
We argued that the usual 'chi by eye' consistency test 
may induce to misleading interpretation of consistency 
between data sets in certain cases. 

We therefore have rigorously quantified the tension be- 
tween detection regions at low DM mass (data set D) 
and XelOO exclusion bound (data set f^), by means of 
Bayesian statistical techniques. Using Bayesian evidence 
we have performed two statistical tests that look for in- 
consistency between data sets and the underlying WIMP 
theoretical model. The model comparison test, or ^-test 
leads to inconclusive result, while the predictive likeli- 
hood test has a striking outcome. We have found that 
the inelastic SI scenario is the favoured one under the hy- 
pothesis of a global explanation of both XelOO and the 
combined set. The same data sets appear to be inconsis- 
tent in both elastic and isospin violating models with a 
significance at 3a and 2a level respectively, if a reason- 
able hypothesis on the observed number of events in the 
XelOO detector is made. Notice that the DM halo distri- 
bution plays an important role for the joint set D}: 
the data adjust the values of the astrophysical parameters 
to find a common ground of agreement. The interpreta- 
tion can be twofold: either one has to look for experi- 
mental systematics and/or astrophysical modelling that 
could accommodate both {!^,D} either the discrepancy 
can be seen as an evidence against the DM explanation 
of current data. 

Considering only the detection regions, we have per- 
formed Bayesian model selection to single out the best 
particle physics scenario that phenomenologically accom- 
modates the data sets of DAMA, CoGcNT and CRESST 
individually and in a combined fit. It turns out that the 
isospin violating picture has odds similar to the simplest 
elastic SI interaction: the extra parameter /„//p is not 
supported by the current data. The inelastic SI model is 
disfavoured with the odds of 1 : 32 with respect elastic 
scattering because it does not provide a good fit for Co- 
GeNT, namely it is penalised because of the unpredictive 
broad prior. 

We remark that Bayesian model comparison outcomes 
point somehow towards the opposite direction than the 
consistency picture between XelOO and the combined set. 
In other words the situation is still too tangled to draw a 
conclusive answer; more data are needed as well as public 
likelihoods given by the collaboration in order to properly 
take into account the experimental systematics. 
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Appendix A: Details on experimental likelihoods 

XENONIOO This experiment is currently running at 
Laboratori Nazionali del Gran Sasso in Italy. It has re- 
cently released the scientific run based on 224.6 live days 
of data taking with a fiducial volume for the detector of 
34 kg |4j. The blind analysis, after cuts optimized for DM 
searches, has reported 2 candidate events for WIMP re- 
coils (A^obs — 2) with an expected background of 1 event 
(more precisely the background with its uncertainty is 
B zL aB — 1 ± 0.2). After cuts the total exposure is 
equivalent to 2323.7 kg days, value used in this analysis. 
The likelihood ln>Cxcioo is the same as in [10 , with up- 
dated total exposure and number of observed events, and 
receives contribution from two parts: 

1. ln£ovGnts is the Poisson probability distribution for 
having seen 2 events with a background of 1 event. 
In this analysis we marginalise over the background 
analytically: 



ln£cvcnts(-^obs|'S', B) = - 



S-B 



In 



(Al) 



2. In/^Loff is a Gaussian distribution function that 
models the uncertainty under threshold of L^s, 
which is the conversion factor between nuclear re- 
coil energy E and photo-electron (PE) produced 
in the primary scintillation light (5*1 signal). The 
actual nuisance parameter is called m. 

The detection range for DM in the Si variable is 3 — )■ 20 
PE, contrary to the old run which used 4 30 PE |I7] . 
As already remarked in \10, , our likelihood is an approxi- 
mation of the one provided by the XENONIOO collab- 
oration in |48| . because the spectral informations are 
not available. The 90s% confidence level in the plane 
{wDMiCrf^} corresponds to ^xls < 3.1. 

CRESST The Cryogenic Rare Event Search with Su- 
perconducting Thermometers experiment is located at 
the Laboratori Nazionali del Gran Sasso in Italy. The 
detectors are scintillators made by CaW04 crystals. The 
latest release covers the period between July 2009 and 
March 2011 and collects the data from eight detector 
modules for a total exposure after cuts of 730 kg days. 
The analysis pursued by the collaboration counts 67 
events (TVobs), which can not be all accounted for by 
known background, leading to a hint of detection with 
a statistical significance of more than Aa [2] . 

The discrimination between background and nuclear 
recoil is obtained by the interplay of the phonon chan- 
nel and the scintillation signal. The phonon signal pro- 
vides a measurement of the total energy deposited by the 



interaction, while the scintillation channel serves to dis- 
criminate the type of interaction (different particles give 
a different light yield). However this information is not 
provided by the collaboration. We construct then an ap- 
proximate likelihood based on the total number of events 
in each module plus the total spectral information [2]. 
We suppose that all detector modules have the same to- 
tal exposure, that is 730/8 kg days. The typical energy 
range for DM searches is 12-40 keV, however each detec- 
tor module has is own energy threshold, as detailed in 
table 1 of [2] together with the total number of events 
observed in each module. 

The first part of the likelihood models the total number 
of events seen in each detector module and has the form: 



In -^module 

^lnA«,3|5„5]B,,), 



(A2) 



where the sum runs over all detector modules. In each 
detector the likelihood is given by the Poisson probability 
of observing nl^^^ events for a given WIMP signal S and 
a given background Bi = Bia + Sie/7 + Bin + ^iPb: 



In 



(A3) 



The index j runs over the 4 different sources of back- 
ground defined above, while i = 1, 8 denotes the mod- 
ules. The second part of the likelihood, ln>Cspcctrai, is 
modelled with a Poisson distribution as well and uses 
the spectral information given in figure 5 of [2]. Each 
bin has a width of 1 keV and the energy ranges from 10 
to 40 keV, for a total of 30 bins. 

The identified background sources are: 

1. Leakage of e/7 at low energies, as a total of 8 events 

2. Scattering from a particles, due to the overlap of 
the alpha recoil band with the acceptance region 

(Sa); 

3. Pb recoils due to alpha decay of Polonium at energy 
around 130 keV (i3pb); 

4. Neutron scatterings off Oxygen mainly (-B„). 

The background is a source of systematics and should be 
marginalised over to obtain the credible regions in the 
{moM, cr^^j-plane. The e/7 background is not varied 
and we suppose that in the first energy bin of each module 
it contributes with one event. The a background has 
constant rate in each energy bin and is described by the 
total number of observed a events such that: 



(A4) 
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The contamination due to Pb decay is parametrized as 
equation 1 of [2j: 



to 11. The likeHhood is then defined as: 



dB 



Pb 



dE 



C 



Pb 



0.13 + exp 



90 
13.72 



(A5) 



with the normalization Cpb let free to vary. Finally the 
neutron background is parameterized following equation 
10 in 3|, with a free normalization Nn'- 



exp 



23.54 



exp 



23.54 



(A6) 



where i?min,max are the extreme of each energy bin/range. 
The total likelihood is then: 

ln>CcRESST(^tot|S', S) = ln£niodulc+ln>Cs pcctral 

(A7) 

and depends on the three nuisance parameters from 
background modelling, resumed in table For each 
nuisance parameter we use a Gaussian prior centered 
on the preferred value, as indicated by the collabora- 
tion: ± cr„ = 9.2 ± 2.3, i3„ ± (T„ = 9.7 ± 5.1 and 
Bpb ± CTpb — 19 ± 5. The sum of the Gaussian distri- 
butions gives XwCb- Note that the reported energies are 
already the bolometric ones: we will not be able to fold 
into the Bayesian analysis the uncertainties related to 
the quenching factors. Indeed these have been used by 
the collaboration to define the acceptance region in each 
detector module and for each target nucleus. 

The CRESST commissioning run on W [i Si [SD] is 
constraining part of the parameter space of the CRESST- 
II run, in particular the region at relatively high DM 
mass. We do not however consider it since other bounds 
will reveal to be more stringent. 

PICASSO The experiment (ig is located at SNO- 
LAB, the Canadian underground laboratory in the Vale 
Creighton mine. This search for DM uses superheated 
liquid droplets, a variant of the bubble chamber tech- 
nique, with C4F10 as liquid target material. PICASSO 
has become sensitive to low mass WIMPs, thanks to 
the lightness of the detector material, to the low energy 
threshold (around 1.7 keV) and to the total exposure of 
114 kg days (on ^^F). It was although originally planned 
for investigating WIMP spin-dependent interaction, be- 
cause of its unpaired proton in ^^F. The collaboration 
has estimated that the scattering off ^^C contributes by 
10% for SI interaction, which we take into account. 

Cosmic muons, 7 and /? particles are well separated as 
background, while the main contamination comes from 
neutron and in particular a particles. In our analysis we 
use the data of figure 5 of [15,, which arise from a com- 
bination of all detectors and for which the background 
has already been subtracted. We can not therefore take 
into account the uncertainties due to the a background, 
however we include in the analysis a 5% of uncertain- 
ties from systematics, as quoted by the collaboration. 
The nuisance parameter a{T) is varied with a flat prior 
within its measured experimental range, that is from 1 



In/: 



PICASSO 



^ln(2w: 



I > 1 



(A8) 



where the index i runs over the eight data bins and <Ji are 
the corresponding error bars. The last factor is merely a 
normalization not important for inference however cru- 
cial when computing the Bayesian evidence. The 905% 
confidence level in the plane {ttidm, fn^} corresponds to 
AXe'ff < 4.6. 

SIMPLE-II The Superheated Instrument for Massive 
ParticLe Experiments (SIMPLE hereafter) is operating 
in the Low Noise Underground Laboratory in southern 
France. It consists of 15 superheated droplets detector 
of C2CIF5. As in the case of PICASSO experiment, it is 
well suited to probe the light DM with SI interaction, as 
well as for constraining the spin-dependent cross-section 
for the whole WIMP mass range. 

We neglect the phase I in ^1 and use the most re- 
cent run of 2010, which has an improved neutron shield. 
The final stage of phase II has been released in (TB] and 
encompasses few months of data taking. The total expo- 
sure after cuts is 6.71 kg days, with one event observed 
(A'obs = 1) and a neutron background estimated to be 
B + aB = 2.2±0.3, while the alpha background has been 
estimated negligible. The likelihood is therefore given by 
the Poisson probability of observing A'obs, marginalised 
analytically over the background, as described in [10, : 



\nCsiMpMN\S) ^ -S - B + ^ + ln{S + B - a%) . 

(A9) 

The observed rate is calculated using equation |16[ with 
the parameter a{T) modelled by a Gaussian prior cen- 
tered on its mean value 4.2 and with standard deviation 
of 0.3. The energy threshold is set to 8 keV. The 905% 
confidence level in the plane {ttidm: ""f ^} corresponds to 
Axis < 3.27. 

KIMS The Korea Invisible Matter Search (KIMS) ex- 
periment |14) is running at the Yangyang Underground 
Laboratory in Korea and is made of CsI(Tl) scintillator 
crystals. The collaboration has released the data col- 
lected from September 2009 to August 2010 for a total 
exposure of 24524.3 kg days. We construct a Gaussian 
likelihood based on the counts/keV/kg/day given in fig- 
ure 4 of [14], which arise from the 8 detectors with the 
lowest alpha particle contamination. The energy range 
of the experiment is 3 — 11 keVee. The detectors are 
scintillators, hence the quenching factor of Iodine and Cs 
are two nuisance parameters, which we vary with a fiat 
prior in the allowed experimental range. In addition a 
third nuisance parameter comes from the a background, 
Ba, described by a Gaussian distribution centered on 
Ba±cra^ 0.07±0.02 counts/keV/kg/day (derived from 
table I of [II]). The 905% confidence level in the plane 
{mrtM,cr^^} corresponds to Axlg < 4.6. 
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FIG. 7. Inelastic scattering parameter 5. Left: 3D marginal posterior pdf for {moM, o"„ , S}, where the S direction is represented 
by the colour code, for DAMA. Central and right: Same as left for CoGeNT and CRESST respectively. The astrophysical 
nuisance parameters are fixed at their central value (SMH), while all the experimental systematics are marginalised over. 



Appendix B: Details on parameter inferences 

Here we provide an in-depth discussion about the de- 
pendence of the detection regions on extra free theoretical 
parameters and additional details about each individual 
experiment considered in this work. 

Elastic SI scattering All the comments below refer to 
figure [T] and are applicable both to fixed or marginaUsed 
astrophysics. 

• DAMA: we remember that the ID posterior pdf 
for (7Na is flat all along the prior range, given by 
the measured experimental range |46| . 

• CoGeNT: marginal posterior is nicely multimodal 
and the best fit point is at todm = 7 GeV and 
af^ = 2 X 10-4°cm2. 

• CRESST: our analysis does not provide a closed re- 
gion at large WIMP masses, as in [2], because we 
could not include the yield information in the like- 
lihood, while we agree with other public analyses, 
see e.g. [S]. The wide region is due to volume ef- 
fects because of the marginalisation over the back- 
ground. Since the marginal posterior pdf is highly 
multimodal inference for the best fit point is mean- 
ingless. 

• XelOO: our exclusion limit agrees well with the 
one provided by the collaboration, despite the 
marginalisation over Lgg. The nuisance parame- 
ter m is centered around the best fit measured by 
the XENONIOO collaboration [52Q We attribute 
the strong constraining power at low WIMP mass 
to the low threshold of 3 PE. 



^ The latest measurements of L^fi by XENONIOO has been re- 
leased very recently |53| and shows a flat behavior for Lefl below 
3 keV. We use [S^ however for the analysis, as the XENONIOO 
collaboration. 



• Bubble chambers: PICASSO is more constrain- 
ing than SIMPLE at low WIMP mass. As ex- 
pected both limits become negligible as soon as 
the DM mass gets larger than ^ 20 GeV. We 
have marginalised over the slope of the threshold 
temperature a(T), therefore our bounds are less 
constraining that the one presented by the col- 
laborations. We have although checked that for 
fixed value of a{T) both limits agree well with [15] 
and [Mj. 

• CDMSSi: it is competitive with PICASSO and 
SIMPLE for DM masses below 20 GeV. 

• KIMS: not relevant for this scenario. 

Inelastic SI scattering The comments below refer to 
figure [2] (left panel) and figure [7j and are valid both for 
SMH and marginalised astrophysical case. 

• DAMA: The region at large DM mass is due to 
scattering off Iodine, while the region at ~ 10 GeV 
is due to scattering off Sodium. The DAMA data 
are not constraining enough to select a value for the 
quenching factors, that again has a fiat marginal 
ID posterior pdf. The parameter S has a definite 
trend, as it is depicted in figure[7]left panel: for the 
scattering off Iodine the larger the cross-section the 
larger the mass splitting is, while the small island 
due to Sodium interactions allows only small mass 
splitting of the order 0(10 - 20) keV. 

• CoGeNT: the detection region depends only on 6 < 
20 keV (central panel of figure [7j note the different 
scale of the color bar) and the smaller the cross- 
section the smaller the mass splitting should be in 
order to produce a nuclear recoil. The marginal 
posterior pdf is again the only one which is uni- 
modal and for which we can quote a best fit point: 
moM = 7.7 GeV, crf^ = 4 x lO^-^^cm^ and S = 6.1 
keV. 
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log,„(mpJ (GeV) IoSioI'^dm) (^eV) l°g,o("^DM) (Qf^V) 



FIG. 8. Isospin violating parameter fn/ fp. Left: 3D marginal posterior pdf for {rriDM, o"„ , /„//p}, where the fn/fp direction 
is represented by the colour code, for DAMA. Central and right: Same as left for CoGeNT and CRESST respectively. The 
astrophysical nuisance parameters are fixed at their central value (SMH), while all the experimental systematics are marginalised 
over. 



• CRESST: inelastic SI interactions fit the data in a 
wide range of masses and cross-sections. All values 
of 5 are allowed, as can be seen from the right panel 
figure [7j 

• KIMS: the exclusion bound is less constraining than 
the one quoted by the collaboration as a conse- 
quence of the marginalisation over the quenching 
factors and a background. 

Isospin violating SI scattering The comments below 
refer to figure [2] (right panel) and figure |8j and are valid 
both for SMH and marginalised astrophysical case. 

• DAMA: again two regions are defined, due to the 
multi-target detector, one at small DM masses and 
one for masses ~ 100 GeV. Both regions denote 
the same trend with respect to fn/ fp- the smaller 



the cross-section is, the more negative the fn/fp 
value becomes, as shown by the correlation between 
muM, a-n^ and fn/fp in figure [s] (left panel). 

• CoGeNT: the detection region has a similar de- 
pendence on fn/fp as the DAMA one (central 
panel figure [s]). The values that maximize the uni- 
modal posterior pdf are todm = 7.5 GeV, a^^^ = 
2 X lO-^Ocm^ and fn/fp = 0.6. 

• CRESST: the excess can be explained by a wide 
range of masses and cross-section values and for all 
possible values of fn/fp (right panel in figure [8|. 

• Exclusion bounds: SIMPLE, KIMS and CDMSSi 
are less restrictive for this physical scenario and 
do not show particular features in their nuisance 
parameters. 
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